Overview

  • Week 8: The Linear Model
  • TODAY: Extending the Linear Model
  • [3 weeks for Spring vacation]
  • Week 10: Effect sizes
  • Week 11: Consolidation

Linear Models (LM): recap

  • Trying to make predictions about the world
  • Capture the relationship between predictor & outcome
  • Linear model (line) is described by an intercept (b0) and a slope (b1)

\[Outcome = b_0 + b_1\times Predictor_1 + \varepsilon\]

  • Use the model to see:
    • R2: proportion of variance in outcome that model explains
    • t-statistic and associated p-value: is b1 different from 0?
    • Direction of relationship between predictor & outcome
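The recap above can be sketched numerically. A minimal example (with made-up data, not the lecture's) computing b0, b1, and R² for a one-predictor LM:

```python
# One-predictor linear model fitted by hand with the least-squares formulas.
# x (predictor) and y (outcome) are made-up illustration data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# slope: covariance of x and y over variance of x; intercept from the means
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
     / sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x

# R^2: proportion of variance in the outcome that the model explains
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_resid / ss_total
```

A positive b1 gives the direction of the relationship; statistical software additionally reports the t-statistic and p-value testing whether b1 differs from 0.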

Today’s Topics

  • How good is our model?
    • Our model vs the mean
    • Error in the model
  • Throwing more predictors into the mix!
  • Comparing hierarchical models
  • Comparing predictors in a model
  • Bringing it all together: what can our model tell us?

How good is our model?

  • A good linear model (LM):
    • Explains a lot about our outcome
    • Captures more than using the simplest model possible
    • Doesn’t contain much error in prediction

Keep it simple, stupid: the mean model

  • We want a LM that explains more than the simplest model possible
  • The simplest model is the mean
    • Mean chocolate bars eaten per week is 3.27
    • Predict how much chocolate your neighbour eats a week…
  • Let’s plot this
    • Chocolate eaten is our outcome variable
    • Not a great model!

  • Let’s add a predictor
  • Error in model
    • Between prediction (using mean) and observed data
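A quick numeric sketch of the mean model (with made-up chocolate counts, not the lecture's data):

```python
# The mean model predicts the same value (the mean) for every person,
# so its error is each observation's distance from that mean.
chocolate = [1, 2, 3, 4, 5, 2, 6]  # hypothetical bars eaten per week

mean_model = sum(chocolate) / len(chocolate)  # the single prediction

# total squared error of the mean model (prediction vs observed)
ss_mean = sum((obs - mean_model) ** 2 for obs in chocolate)
```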

We can do better, probably…

  • Let’s construct a LM between chocolate liking and eating
    • The line explaining our data with the least error
    • Compare to the line representing our mean model
    • Shows the improvement in prediction from LM (vs mean)

  • Larger difference = greater improvement!
  • We want a LM that explains more than the mean model
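The improvement over the mean model can be sketched like this (the liking and eaten scores are made up):

```python
# Compare the mean model's error with a one-predictor LM's error.
liking = [1, 3, 4, 6, 8]   # hypothetical chocolate-liking scores
eaten = [1, 2, 4, 5, 7]    # hypothetical bars eaten per week

n = len(eaten)
mean_liking = sum(liking) / n
mean_eaten = sum(eaten) / n

# least-squares slope and intercept for the LM
b1 = sum((l - mean_liking) * (e - mean_eaten) for l, e in zip(liking, eaten)) \
     / sum((l - mean_liking) ** 2 for l in liking)
b0 = mean_eaten - b1 * mean_liking

ss_mean = sum((e - mean_eaten) ** 2 for e in eaten)  # error of the mean model
ss_lm = sum((e - (b0 + b1 * l)) ** 2               # error of the LM
            for l, e in zip(liking, eaten))
improvement = ss_mean - ss_lm  # larger difference = greater improvement
```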

Interim summary

  • Is our LM better than the simplest model possible?
    • The mean model is the simplest model
    • We want a LM that explains more than the mean model
  • But…a LM better than the mean still contains error
  • How much error is okay?
    • Does the model explain more than it doesn’t explain?
    • How well does the LM predict the outcome?

Error in the model

  • No model will fit the data perfectly
  • Fit a line to best capture relationships between variables
  • Want the least error possible
    • Compare predicted and observed data points

 

Bringing it all together: the F-statistic

  • The ratio of what the model can explain to what it cannot explain is the F-statistic!

\[F =\frac{what\ the\ model\ can\ explain}{what\ the\ model\ cannot\ explain}=\frac{signal}{noise}\]

  • We want signal to be as big as possible

  • We want noise to be as small as possible

  • A ratio of variance explained relative to variance unexplained

  • Ratio > 1 means our model can explain more than it cannot explain

  • Associated p-value: how likely we are to find an F-statistic at least as large as the observed one if the null hypothesis is true
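As a sketch, the F-statistic is this ratio of mean squares (the sums of squares and sample size below are made up):

```python
# F = (SS_model / df_model) / (SS_residual / df_residual) = signal / noise
ss_model = 22.1  # what the model can explain (hypothetical)
ss_resid = 0.7   # what it cannot explain (hypothetical)
n = 5            # observations (hypothetical)
k = 1            # predictors

ms_model = ss_model / k            # signal per model degree of freedom
ms_resid = ss_resid / (n - k - 1)  # noise per residual degree of freedom
F = ms_model / ms_resid            # > 1: explains more than it leaves unexplained
```

The p-value then comes from comparing F against the F-distribution with (k, n − k − 1) degrees of freedom.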

Why you gotta complicate things?

  • Multiple predictors are not much more complicated!
    • We build models to predict what is happening in the world
  • Simple explanations for complex relationships?
  • Multiple predictors = greater explanatory power

How multiple predictors (don’t really) change the LM equation

\[\begin{aligned}Outcome &= Model + Error\\ Y&=b_0 + b_1\times Predictor_1 + \varepsilon\\ &=b_0 + b_1\times Predictor_1 + b_2\times Predictor_2 + \varepsilon\end{aligned}\]

  • Y: outcome
  • b0: value of outcome when predictors are 0 (the intercept)
  • b1: change in outcome associated with a unit change in predictor 1
  • b2: change in outcome associated with a unit change in predictor 2

How multiple predictors (don’t really) change the LM equation

  • One predictor LM = regression line
  • Two+ predictor LM = regression plane
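Fitting the plane works the same way as fitting the line. A sketch using numpy, with made-up data and hypothetical variable names:

```python
# Two-predictor LM (a regression plane) fitted by ordinary least squares.
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # predictor 1 (hypothetical)
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])  # predictor 2 (hypothetical)
y = np.array([2.5, 3.0, 5.5, 6.0, 8.5, 9.0])   # outcome (hypothetical)

# design matrix: a column of 1s for the intercept, then one column per predictor
X = np.column_stack([np.ones_like(x1), x1, x2])

# least-squares estimates: intercept b0, plus one slope per predictor
b, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = b
```

Here b1 is the change in y per unit change in x1 holding x2 constant, and likewise for b2.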

Puppies, puppies everywhere!

The outcome: happiness

Predictor 1: puppies (number of puppies)

Predictor 2: fluffy (fluffiness rating)

\[happiness\ =\ b_0 + b_1\times puppies\ +\ b_2\times fluffy + \varepsilon\]

  • Y: happiness
  • b0: value of happiness when puppies and fluffy are 0 (the intercept)
  • b1: change in happiness associated with a unit change in puppies when fluffy is 0
  • b2: change in happiness associated with a unit change in fluffy when puppies is 0

Model fit: R2 value

  • The R2 is 0.51
    • 51% of the variance in happiness was explained by puppies and fluffy ratings
  • This is just in our observed data!
  • Our adjusted R2 value was 0.478
    • If we used the same model with the population, we should be able to explain 48% of the variance in happiness
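Adjusted R² can be sketched from its formula; the sample size below is a hypothetical choice that roughly reproduces the slide's 0.478:

```python
# Adjusted R^2 penalizes R^2 for the number of predictors, estimating
# the variance the model would explain in the population.
r2 = 0.51  # from the slide
k = 2      # predictors (puppies and fluffy)
n = 34     # hypothetical sample size

adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```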

Interim summary, mark II

  • The LM can be expanded to include additional predictors
  • The model is still described by an intercept (b0)
  • It now includes slopes (bs) for each predictor
  • This creates a regression plane instead of a line
  • We can assess how good this model is
    • F-ratio and associated p-value
    • R2
  • What if we want to compare two linear models?

Comparing linear models

  • We can compare models with different numbers of predictors
    • See which model better captures our outcome
    • The models must be ‘hierarchical’
      • 2nd model has the same predictors as the 1st model plus extra
      • 3rd model has the same predictors as the 2nd model plus extra

Even more puppies!

Predictor 3: dirt (amount of dirt)

Model 1:

\[\begin{aligned}happiness\ = &\ b_0\ +\\&\ b_1\times puppies\ +\\ &\ b_2\times fluffy + \varepsilon\end{aligned}\]

Model 2:

\[\begin{aligned}happiness\ = &\ b_0\ +\\&\ b_1\times puppies\ +\\ &\ b_2\times fluffy\ +\\ &\ b_3\times dirt + \varepsilon\end{aligned}\]

  • First, which is the better model?
  • Second, how useful is the best model?

With great (predictive) power comes great responsibility

  • Need to assess which is the best model
  • Want to compare Model 1 (2 predictors) and Model 2 (3 predictors)
    • Do extra predictors improve the model?
    • Look at R2-change and F-change

F-change

  • Does Model 2 explain more variance compared to Model 1?
    • Does adding more predictors significantly improve model fit?
  • The F-change statistic is 152.65 with a p-value of < .001
  • Model 2 was significantly better at explaining variability in happiness than Model 1
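F-change can be computed from the two models' R² values. The numbers below are made up for illustration (the slide's 152.65 comes from the actual data):

```python
# F-change: extra variance explained per added predictor, relative to the
# variance the larger model still leaves unexplained.
r2_model1 = 0.51  # Model 1: puppies + fluffy (from the slide)
r2_model2 = 0.80  # Model 2: + dirt (hypothetical)
n = 100           # hypothetical sample size
k1, k2 = 2, 3     # number of predictors in each model

f_change = ((r2_model2 - r2_model1) / (k2 - k1)) \
           / ((1 - r2_model2) / (n - k2 - 1))
```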

Relative contribution of predictor variables

  • Now we know Model 2 is the best model…
  • Which predictors explain most variance in the outcome variable?
    • Assess the relative contribution of our predictor variables
    • Look at the b-values and associated p-values for each predictor

Number of puppies…

  • puppies significantly predicted happiness
    • b1 is positive: as puppies increases, happiness increases
    • For every unit increase in puppies, happiness increases by 0.99 units
    • When all other predictors are held constant!

Relative contribution of predictor variables

  • We CANNOT interpret the relative contribution from these bs!
  • We can’t say that puppies caused happiness to change by a greater amount than dirt
    • All predictors were measured in different units
    • We can only compare predictors measured in the same unit (e.g. cm or £)
  • Instead, need to look at standardized versions of our betas
    • These are ‘measured’ in standard deviations
  • Unstandardized b1 = number of units that happiness increases for every unit increase in puppies
  • Standardized b1 = number of standard deviations (SDs) that happiness increases for every standard deviation increase in puppies
  • The standardized beta for puppies (standardized b1) is 1.01
    • As puppies increase by 1 SD, happiness increases by 1.01 SDs
  • The standardized beta for fluffy (standardized b2) is 0.22
    • As fluffy increases by 1 SD, happiness increases by 0.22 SDs
  • The standardized beta for dirt (standardized b3) is -0.71
    • As dirt increases by 1 SD, happiness decreases by 0.71 SDs

Relative contribution of predictors

  • All predictors (bs) are now in the same unit: standard deviations
  • Now we can directly compare their relative contributions
  • 1.01 is the largest standardized beta
  • So each SD increase in puppies predicts a larger SD change in happiness than the other predictors
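The conversion to a standardized beta is just a rescaling; the SDs below are hypothetical, chosen to land near the slide's 1.01:

```python
# standardized beta = unstandardized b * SD(predictor) / SD(outcome),
# i.e. SDs of change in the outcome per SD change in the predictor.
b1 = 0.99            # unstandardized slope for puppies (from the slide)
sd_puppies = 2.50    # hypothetical SD of the predictor
sd_happiness = 2.45  # hypothetical SD of the outcome

beta1 = b1 * sd_puppies / sd_happiness
```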

Summary

  • Run hierarchical models
  • Assess if either model is better than the mean model:
    • F-statistic and associated p-value
  • Assess which is the better model
    • R2-change, F-change statistic and associated p-value
  • Assess which variables significantly predict the outcome
    • t-value and associated p-value
  • Assess which variables have largest contribution to the outcome
    • Standardized bs

Overall summary

  • The LM captures the relationship between one or more predictors, x, and an outcome, y
    • LM equation: Outcome = b0 + b1 × Predictor 1 + b2 × Predictor 2 + … + bn × Predictor n + Error
  • Is our model useful? Assess model fit:
    • F-statistic (or F-change) and associated p-value
    • R2 (or R2-change)
  • What does our model tell us?
    • Unstandardized bs, t-statistic and associated p-value
    • Standardized bs
